igraph tutorial

Author

Antoine Bodein

Published

November 10, 2022

Here is a quick presentation of the igraph package. Some functions of the package are exemplified on toy networks.

Throughout this tutorial, you will be reproducing some manipulations with the got1 network.

The GOT (got1) dataset contains the Game of Thrones characters and their interactions in the first season.

There are five interaction types. Character A and Character B are connected whenever:

Start

Load the igraph library (and the tidyverse, which is used later for data manipulation).

library(igraph)
library(tidyverse)
load("got1.Rda") # /home/public/network/got1.Rda

My first graph

We can build a graph from a list of edges

g1 <- make_graph( edges=c("A", "B",
                          "B", "C",
                          "C", "A"), directed=FALSE ) 

plot(g1)

class(g1)
[1] "igraph"
g1
IGRAPH 22f5385 UN-- 3 3 -- 
+ attr: name (v/c)
+ edges from 22f5385 (vertex names):
[1] A--B B--C A--C
g2 <- make_graph( edges=c("A", "B",
                          "B", "C",
                          "C", "A"), directed=TRUE )
g2
IGRAPH fae42fb DN-- 3 3 -- 
+ attr: name (v/c)
+ edges from fae42fb (vertex names):
[1] A->B B->C C->A
plot(g2)

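# duplicated edges (Jim -> Jack) and self-loops (John -> John) are allowed;
# isolates adds vertices that have no edge
g3 <- make_graph( c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"), 
                  isolates=c("Jesse", "Janis", "Jennifer", "Justin") ) 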
plot(g3)

Edges, Vertices, etc…

ecount(g3)
[1] 4
E(g3)
+ 4/4 edges from 21adb3a (vertex names):
[1] John->Jim  Jim ->Jack Jim ->Jack John->John
class(E(g3))
[1] "igraph.es"
vcount(g3)
[1] 7
V(g3)
+ 7/7 vertices, named, from 21adb3a:
[1] John     Jim      Jack     Jesse    Janis    Jennifer Justin  
class(V(g3))
[1] "igraph.vs"
Your turn

Display the nodes and the edges of got1. How many nodes and edges are there?

Attributes

Add attributes to nodes

V(g3)$name # automatically generated when we created the network.
[1] "John"     "Jim"      "Jack"     "Jesse"    "Janis"    "Jennifer" "Justin"  
V(g3)$gender <- c("male", "male", "male", "male", "female", "female", "male") 
# also works with 
g3 <- set_vertex_attr(graph = g3, name = "new_attribute", value = 1:7)

Get attributes

vertex_attr(g3)
$name
[1] "John"     "Jim"      "Jack"     "Jesse"    "Janis"    "Jennifer" "Justin"  

$gender
[1] "male"   "male"   "male"   "male"   "female" "female" "male"  

$new_attribute
[1] 1 2 3 4 5 6 7

Edge attributes

E(g3)$type <- "email" # Edge attribute, assign "email" to all edges
E(g3)$weight <- 10    # Edge weight, setting all existing edges to 10

edge_attr(g3)
$type
[1] "email" "email" "email" "email"

$weight
[1] 10 10 10 10
Your turn

Which attributes do the nodes of got1 have? Which attributes do the edges have?

Import from data.frame

Let’s build a data.frame with 2 columns (from and to)

set.seed(145) # fix the random seed so that the example is reproducible
data.set <- data.frame(from = sample(LETTERS[1:10], size = 20, replace = TRUE), 
                       to =   sample(LETTERS[1:10], size = 20, replace = TRUE))
g <- igraph::graph_from_data_frame(data.set, directed = FALSE)
g
IGRAPH 80511dd UN-- 10 20 -- 
+ attr: name (v/c)
+ edges from 80511dd (vertex names):
 [1] B--H F--G B--D A--I B--F E--I E--E I--J A--C F--I E--G B--F E--I H--J F--E
[16] D--E B--J A--H B--G A--I
plot(g)
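graph_from_data_frame() can take more than the two endpoint columns. As a minimal sketch (the toy tables edges and nodes below are hypothetical, not part of the tutorial data): extra columns of the edge table become edge attributes, and an optional vertices table adds vertex attributes, including vertices that have no edge.

```r
library(igraph)

# Hypothetical toy data: one row per edge; extra columns become edge attributes
edges <- data.frame(from   = c("A", "A", "B"),
                    to     = c("B", "C", "C"),
                    weight = c(1, 2, 3))

# Optional vertex table: the first column must contain the vertex names
nodes <- data.frame(name  = c("A", "B", "C", "D"),
                    group = c("x", "x", "y", "y"))

g_df <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

E(g_df)$weight   # edge attribute taken from the 'weight' column
V(g_df)$group    # vertex attribute taken from the 'nodes' table
vcount(g_df)     # "D" is kept as an isolated vertex
```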

Import from an adjacency matrix

mat <- matrix(sample(0:1, size = 100, replace = TRUE), ncol = 10)
colnames(mat) <-  LETTERS[1:10]
rownames(mat) <- LETTERS[1:10]
mat
  A B C D E F G H I J
A 1 1 0 1 1 1 0 1 0 1
B 0 0 0 1 0 0 1 1 0 0
C 0 0 0 0 1 0 1 0 0 0
D 0 1 0 0 1 1 0 0 0 1
E 1 0 0 0 1 1 0 0 1 1
F 0 0 0 1 1 0 1 1 1 0
G 0 1 0 1 1 1 1 1 1 1
H 1 1 1 1 1 0 1 0 1 1
I 0 0 0 0 0 0 1 0 0 1
J 1 1 1 0 1 0 0 0 0 0
g <- igraph::graph_from_adjacency_matrix(mat)
g
IGRAPH ecb7b31 DN-- 10 48 -- 
+ attr: name (v/c)
+ edges from ecb7b31 (vertex names):
 [1] A->A A->B A->D A->E A->F A->H A->J B->D B->G B->H C->E C->G D->B D->E D->F
[16] D->J E->A E->E E->F E->I E->J F->D F->E F->G F->H F->I G->B G->D G->E G->F
[31] G->G G->H G->I G->J H->A H->B H->C H->D H->E H->G H->I H->J I->G I->J J->A
[46] J->B J->C J->E
plot(g)
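By default graph_from_adjacency_matrix() builds a directed graph, which is why the output above shows DN-- and one edge per non-zero cell. A short sketch on a hypothetical symmetric matrix (sym is made up for illustration) shows how the mode argument changes the result:

```r
library(igraph)

# Hypothetical symmetric 3 x 3 adjacency matrix
sym <- matrix(c(0, 1, 0,
                1, 0, 1,
                0, 1, 0), nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

g_dir <- graph_from_adjacency_matrix(sym)                       # default: directed
g_und <- graph_from_adjacency_matrix(sym, mode = "undirected")  # one edge per pair

ecount(g_dir)  # every non-zero cell is a directed edge
ecount(g_und)  # A--B and B--C only
```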

Plot

plot(...)

Plotting with igraph: the network plots have a wide set of parameters you can set. Those include node options (starting with vertex.) and edge options (starting with edge.). A list of selected options is included below, but you can also check out ?igraph.plotting for more information.

NODES
vertex.color         Node color
vertex.frame.color   Node border color
vertex.shape         One of "none", "circle", "square", "csquare", "rectangle",
                     "crectangle", "vrectangle", "pie", "raster", or "sphere"
vertex.size          Size of the node (default is 15)
vertex.size2         The second size of the node (e.g. for a rectangle)
vertex.label         Character vector used to label the nodes
vertex.label.family  Font family of the label (e.g. "Times", "Helvetica")
vertex.label.font    Font: 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol
vertex.label.cex     Font size (multiplication factor, device-dependent)
vertex.label.dist    Distance between the label and the vertex
vertex.label.degree  The position of the label in relation to the vertex,
                     where 0 is right, pi is left, pi/2 is below, and -pi/2 is above

EDGES
edge.color           Edge color
edge.width           Edge width, defaults to 1
edge.arrow.size      Arrow size, defaults to 1
edge.arrow.width     Arrow width, defaults to 1
edge.lty             Line type, could be 0 or "blank", 1 or "solid", 2 or "dashed",
                     3 or "dotted", 4 or "dotdash", 5 or "longdash", 6 or "twodash"
edge.label           Character vector used to label edges
edge.label.family    Font family of the label (e.g. "Times", "Helvetica")
edge.label.font      Font: 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol
edge.label.cex       Font size for edge labels
edge.curved          Edge curvature, range 0-1 (FALSE sets it to 0, TRUE to 0.5)
edge.arrow.mode      Vector specifying whether edges should have arrows,
                     possible values: 0 no arrow, 1 back, 2 forward, 3 both

OTHER
margin               Empty space margins around the plot, vector with length 4
frame                If TRUE, the plot will be framed
main                 If set, adds a title to the plot
sub                  If set, adds a subtitle to the plot

The first way to modify the default plot is to pass those parameters directly to plot(...).

plot(g, edge.arrow.size=.2, edge.curved=0,
     vertex.color=c(1,2,1,2,1,2,1,2,1,2), 
     vertex.frame.color="red",
     vertex.label.cex=.7,
     edge.color="blue",
     main = "Network")

Or you can set plotting parameters as graph attributes:

V(g)$size <- 20
V(g)$frame.color <- "white"
V(g)$color <- "blue"
V(g)$label.color <- "white"

vertex_attr(g)
$name
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

$size
 [1] 20 20 20 20 20 20 20 20 20 20

$frame.color
 [1] "white" "white" "white" "white" "white" "white" "white" "white" "white"
[10] "white"

$color
 [1] "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"

$label.color
 [1] "white" "white" "white" "white" "white" "white" "white" "white" "white"
[10] "white"
E(g)$color <- "red"
E(g)$lty <- 4

plot(g)

Personal favourite

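Vertex attributes are stored as a list, which can be cast to a data.frame, modified with the usual data manipulation tools, and assigned back to the graph.

new_vertex_attr <- vertex_attr(g) %>% as.data.frame() %>% 
  mutate(color = ifelse(name == "A", "green", "pink"))
new_vertex_attr
vertex_attr(g) <- new_vertex_attr  # a data.frame is a list of columns, so as.list() is optional
plot(g)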
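igraph also offers a direct conversion in the same spirit. The sketch below uses a hypothetical toy graph g_toy; the function is called with the igraph:: prefix because dplyr exports a function with the same name once the tidyverse is loaded.

```r
library(igraph)

# Hypothetical toy graph with one vertex attribute and one edge attribute
g_toy <- make_graph(c("A", "B", "B", "C"), directed = FALSE)
V(g_toy)$color  <- c("green", "pink", "pink")
E(g_toy)$weight <- c(1, 2)

# One row per vertex, one column per vertex attribute
v_df <- igraph::as_data_frame(g_toy, what = "vertices")
v_df

# One row per edge: 'from', 'to', then the edge attributes
e_df <- igraph::as_data_frame(g_toy, what = "edges")
e_df
```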

Your turn

Plot the graph got1.

Modify the vertex attribute to add the sex (/home/public/network/got_sex.csv).

Change the node color based on the sex.

Network layouts

Network layouts are simply algorithms that return coordinates for each node in a network.
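To make this concrete, a layout function can be called on its own. In this small sketch (the 5-node ring is only an illustration), the result is a plain numeric matrix with one row per node and one column per coordinate:

```r
library(igraph)

ring   <- make_ring(5)             # toy graph: 5 nodes in a cycle
coords <- layout_in_circle(ring)   # call the layout function directly

dim(coords)    # one row per node, two columns (x and y)
head(coords)

# The matrix can be passed to plot() through the layout parameter
plot(ring, layout = coords)
```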

Let’s generate a random graph of 80 nodes with the preferential attachment model (sample_pa)

net.bg <- sample_pa(80) 
V(net.bg)$size <- 8
V(net.bg)$frame.color <- "white"
V(net.bg)$color <- "orange"
V(net.bg)$label <- "" 
E(net.bg)$arrow.mode <- 0
plot(net.bg)

Then pass any layout_* function to the layout parameter, which sets the position of each node.

plot(net.bg, layout=layout_randomly)

However, when layout is a function, the coordinates are recomputed on every call to plot(), so a random layout changes each time.

par(mfrow=c(2, 2), mar=c(1,1,1,1))
plot(net.bg, layout=layout_randomly, main = "1")
plot(net.bg, layout=layout_randomly, main = "2")
plot(net.bg, layout=layout_randomly, main = "3")
plot(net.bg, layout=layout_randomly, main = "4")

Alternatively, you can compute the coordinates once and reuse them

par(mfrow=c(2, 2), mar=c(1,1,1,1))
l <- layout_randomly(net.bg)
plot(net.bg, layout=l, main = "1")
plot(net.bg, layout=l, main = "2")
plot(net.bg, layout=l, main = "3")
plot(net.bg, layout=l, main = "4")

Some other layouts

par(mfrow=c(2, 2), mar=c(1,1,1,1))
plot(net.bg, layout=layout_in_circle)
plot(net.bg, layout=layout_on_sphere)
plot(net.bg, layout=layout_on_grid)
plot(net.bg, layout=layout_with_fr)

Your turn

Plot the got1 network and test different layouts.

Graph Topology

Centrality

Degree

V(g)
+ 10/10 vertices, named, from ecb7b31:
 [1] A B C D E F G H I J
degree(g)
 A  B  C  D  E  F  G  H  I  J 
11  8  4  9 13  9 14 12  6 10 
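Note that g is directed here, so degree(g) counts incoming and outgoing edges together. A minimal sketch on a hypothetical 3-node directed graph shows how the mode argument separates them:

```r
library(igraph)

# Hypothetical directed toy graph: A -> B, A -> C, B -> C
g_dir <- make_graph(c("A", "B", "A", "C", "B", "C"), directed = TRUE)

degree(g_dir, mode = "out")  # outgoing edges: A = 2, B = 1, C = 0
degree(g_dir, mode = "in")   # incoming edges: A = 0, B = 1, C = 2
degree(g_dir, mode = "all")  # both directions (the default): 2 for every node
```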
hist(degree(net.bg))

# compare to power law
po <- function(x,k,a){
  a*x^-k
}
plot(x = 1:20, y = po(x = 1:20,k = 3, a = 2), type = "l")
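The comparison is often easier to read on log-log axes, where a power law appears as a roughly straight line. The following is only a sketch under assumptions: the graph size (500), the seed, and the object names are arbitrary choices for illustration.

```r
library(igraph)

set.seed(1)                                # arbitrary seed, for reproducibility
g_pa <- sample_pa(500, directed = FALSE)   # larger graph than net.bg, to get a smoother curve

# Relative frequency of each degree; the first element corresponds to degree 0
dd <- degree_distribution(g_pa)
k  <- seq_along(dd) - 1

# Keep only the points that can be drawn on log axes
keep <- k > 0 & dd > 0
plot(k[keep], dd[keep], log = "xy",
     xlab = "degree", ylab = "fraction of nodes")
```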

Your turn

Plot the degree distribution. What is the highest degree? Which character has the highest degree?

Betweenness

be <- betweenness(net.bg)
be
 [1]  0 41  6  4 15  0 20 22 18  3  0  0  0  0  0  4  6  2  0  6  0  4  0  0  0
[26]  0  0  0  1  2  0  1  0  0  2  0  3  0  0  1  0  0  0  0  2  0  0  2  0  3
[51]  3  0  0  3  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
[76]  0  0  0  0  0
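The betweenness of a node counts the shortest paths between other pairs of nodes that pass through it. A toy example (the 3-node path below is hypothetical) makes the score easy to check by hand: only the middle node lies on a shortest path between two other nodes.

```r
library(igraph)

# Hypothetical toy graph: A -- B -- C
path3 <- make_graph(c("A", "B", "B", "C"), directed = FALSE)

betweenness(path3)
# A B C 
# 0 1 0 
```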

Color the node based on the betweenness score.

Easy color gradient

# from green (low values) to red (high values)
my_palette <- grDevices::colorRampPalette(c("green","red"))

# set the number of colors
nb_color <- 5

# a quick plot of the palette
plot(rep(1,nb_color),col=my_palette(nb_color), pch=19,cex=3)

# with cut, values that fall in the same interval get the same color
my_df <- data.frame(value = be,
                    color = cut(be, nb_color, labels = my_palette(nb_color)) %>% 
                      as.character) # cut() returns a factor, so convert it to character
head(my_df)
plot(net.bg, vertex.color = my_df$color)

Plot the distribution of the betweenness centrality. What is the highest betweenness? Which character has the highest betweenness?

Plot the network. With a color gradient, color each node to represent its betweenness score.

Pathology

library(networkdata)
data("starwars", package = "networkdata") 
# Scene Co-occurrence of Star Wars Characters (Episode 1-7)

st4 <- starwars[[4]]  # Episode 4

st4
IGRAPH 99191b2 UNW- 21 60 -- Episode IV – A New Hope
+ attr: name (g/c), name (v/c), height (v/n), mass (v/n), hair_color
| (v/c), skin_color (v/c), eye_color (v/c), birth_year (v/n), sex
| (v/c), homeworld (v/c), species (v/c), weight (e/n)
+ edges from 99191b2 (vertex names):
 [1] R2-D2      --CHEWBACCA R2-D2      --C-3PO     R2-D2      --BERU     
 [4] R2-D2      --LUKE      R2-D2      --OWEN      R2-D2      --OBI-WAN  
 [7] R2-D2      --LEIA      R2-D2      --BIGGS     R2-D2      --HAN      
[10] CHEWBACCA  --OBI-WAN   CHEWBACCA  --C-3PO     CHEWBACCA  --LUKE     
[13] CHEWBACCA  --HAN       CHEWBACCA  --LEIA      LUKE       --CAMIE    
[16] CAMIE      --BIGGS     LUKE       --BIGGS     DARTH VADER--LEIA     
+ ... omitted several edges
as_adjacency_matrix(st4, attr = "weight")
21 x 21 sparse Matrix of class "dgCMatrix"
                                                            
R2-D2        .  3 17 14 . . 1  5 1 1  4 . .  6 . . . . . . .
CHEWBACCA    3  .  4 14 . . .  8 . .  4 . . 19 . . . . . . .
C-3PO       17  4  . 18 . . 1  6 2 2  6 . .  6 . . . . . 1 .
LUKE        14 14 18  . . 2 4 17 3 3 19 . . 26 . . 1 1 2 3 1
DARTH VADER  .  .  .  . . . .  1 . .  1 1 7  . . . . . . . .
CAMIE        .  .  .  2 . . 2  . . .  . . .  . . . . . . . .
BIGGS        1  .  1  4 . 2 .  1 . .  . . .  . . . . 1 2 3 .
LEIA         5  8  6 17 1 . 1  . 1 .  1 1 1 13 . . . . . 1 .
BERU         1  .  2  3 . . .  1 . 3  . . .  . . . . . . . .
OWEN         1  .  2  3 . . .  . 3 .  . . .  . . . . . . . .
OBI-WAN      4  4  6 19 1 . .  1 . .  . . .  9 . . . . . . .
MOTTI        .  .  .  . 1 . .  1 . .  . . 2  . . . . . . . .
TARKIN       .  .  .  . 7 . .  1 . .  . 2 .  . . . . . . . .
HAN          6 19  6 26 . . . 13 . .  9 . .  . 1 1 . . . . .
GREEDO       .  .  .  . . . .  . . .  . . .  1 . . . . . . .
JABBA        .  .  .  . . . .  . . .  . . .  1 . . . . . . .
DODONNA      .  .  .  1 . . .  . . .  . . .  . . . . 1 1 . .
GOLD LEADER  .  .  .  1 . . 1  . . .  . . .  . . . 1 . 1 1 .
WEDGE        .  .  .  2 . . 2  . . .  . . .  . . . 1 1 . 3 .
RED LEADER   .  .  1  3 . . 3  1 . .  . . .  . . . . 1 3 . 1
RED TEN      .  .  .  1 . . .  . . .  . . .  . . . . . . 1 .
par(mar=c(1,1,1,1))
plot(st4, layout=layout_nicely, vertex.label.cex = 0.5, vertex.shape="none", 
     edge.color = "grey")

shortest path

sp <- shortest_paths(graph = st4,
                     from = V(st4)[name == "LUKE"],
                     to = V(st4)[name == "DARTH VADER"],
                     output = "both",
                     weights = rep(1, ecount(st4)))  # unit weights: path length = number of edges (weights = NA also works)
sp 
$vpath
$vpath[[1]]
+ 3/21 vertices, named, from 99191b2:
[1] LUKE        LEIA        DARTH VADER


$epath
$epath[[1]]
+ 2/60 edges from 99191b2 (vertex names):
[1] LUKE       --LEIA DARTH VADER--LEIA


$predecessors
NULL

$inbound_edges
NULL
# plot
## highlight the edges of the path (color)
ecol <- rep("gray80", ecount(st4))
ecol[unlist(sp$epath)] <- "orange"

## highlight the edges of the path (width)
esize <- rep(2, ecount(st4))
esize[unlist(sp$epath)] <- 4

## highlight the nodes of the path (color)
vcol <- rep("gray80", vcount(st4))
vcol[unlist(sp$vpath)] <- "gold"

## highlight the nodes of the path (size)
vsize <-  rep(1, vcount(st4))
vsize[unlist(sp$vpath)] <- 10

par(mar=c(1,1,1,1))
plot(st4, vertex.color=vcol, edge.color=ecol,
     edge.width=esize, vertex.size = vsize,
     vertex.label.cex = 0.5)

all distances

distances(st4)

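diameter = longest shortest path between any two nodes. Because st4 has a weight edge attribute, diameter() uses it by default; pass weights = NA to count edges instead.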

diameter(st4)
[1] 10
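The effect of edge weights on distances can be checked on a toy graph. In this sketch (the 3-node path and its weights are hypothetical), the weighted diameter sums the weights along the path, while weights = NA counts edges:

```r
library(igraph)

# Hypothetical toy graph: A -- B -- C, each edge with weight 5
path3 <- make_graph(c("A", "B", "B", "C"), directed = FALSE)
E(path3)$weight <- c(5, 5)

diameter(path3)                  # 10: weights are summed along A -- B -- C
diameter(path3, weights = NA)    # 2: number of edges, weights ignored

distances(path3)                 # weighted distance matrix
distances(path3, weights = NA)   # distances counted in edges
```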
Your turn

What is the diameter of the graph? How long is the shortest path from Jon to Daenerys? What is the path that links Jon to Daenerys?

Ego

eg <- make_ego_graph(st4, nodes = "DARTH VADER", order = 1)
plot(eg[[1]])
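The order argument sets how many steps away from the node the ego network extends. A small sketch on a hypothetical star graph, using ego() and ego_size(), which return the members and the size of the neighborhood (the node itself is included):

```r
library(igraph)

# Hypothetical toy graph: vertex 1 is the center, vertices 2 to 5 are leaves
star <- make_star(5, mode = "undirected")

ego_size(star, order = 1, nodes = 1)  # 5: the center and its four leaves
ego_size(star, order = 1, nodes = 2)  # 2: the leaf and the center
ego_size(star, order = 2, nodes = 2)  # 5: two steps reach every vertex

ego(star, order = 1, nodes = 2)       # the vertices themselves
```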

Your turn

What is the degree of Jon? Plot the ego network of order 1 of Jon.

Modularity

Cliques

Cliques = complete subgraphs of an undirected graph.

cl <- cliques(st4, min = 4) 
head(cl)
[[1]]
+ 4/21 vertices, named, from 99191b2:
[1] R2-D2 C-3PO LUKE  OWEN 

[[2]]
+ 4/21 vertices, named, from 99191b2:
[1] R2-D2 C-3PO LUKE  LEIA 

[[3]]
+ 4/21 vertices, named, from 99191b2:
[1] LUKE        GOLD LEADER WEDGE       RED LEADER 

[[4]]
+ 4/21 vertices, named, from 99191b2:
[1] C-3PO      LUKE       LEIA       RED LEADER

[[5]]
+ 4/21 vertices, named, from 99191b2:
[1] LUKE        DODONNA     GOLD LEADER WEDGE      

[[6]]
+ 4/21 vertices, named, from 99191b2:
[1] DARTH VADER LEIA        MOTTI       TARKIN     
cl <- largest_cliques(st4)  # list of the cliques of maximum size

vcol <- rep("grey80", vcount(st4))
vcol[unlist(cl)] <- "gold"  # reuse cl instead of recomputing the cliques

par(mar=c(1,1,1,1))
plot(st4, vertex.color = vcol, vertex.size = 4)

induced_subgraph(graph = st4, vids = unlist(cl)) %>% 
  plot(main = "Largest clique (subgraph)", layout = layout_in_circle)

Your turn

What is the largest clique? Plot it.

Community detection

cluster <- cluster_walktrap(st4) # louvain, edge_betweenness, ...
cluster
IGRAPH clustering walktrap, groups: 3, mod: 0.15
+ groups:
  $`1`
   [1] "R2-D2"     "CHEWBACCA" "C-3PO"     "LUKE"      "LEIA"      "BERU"     
   [7] "OWEN"      "OBI-WAN"   "HAN"       "GREEDO"    "JABBA"    
  
  $`2`
  [1] "CAMIE"       "BIGGS"       "DODONNA"     "GOLD LEADER" "WEDGE"      
  [6] "RED LEADER"  "RED TEN"    
  
  $`3`
  [1] "DARTH VADER" "MOTTI"       "TARKIN"     
  + ... omitted several groups/vertices
class(cluster)
[1] "communities"
plot_dendrogram(cluster, mode="hclust")  # current name of the older dendPlot()

par(mar=c(1,1,1,1))
plot(cluster, st4, layout=layout_nicely, vertex.label.cex = 0.5, 
     vertex.shape="none")
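A communities object can also be queried directly. The sketch below runs the same algorithm on a hypothetical toy graph (two triangles joined by one edge) and uses the accessor functions membership(), sizes() and modularity():

```r
library(igraph)

# Hypothetical toy graph: triangles 1-2-3 and 4-5-6, linked by the edge 3-4
two_tri <- make_graph(c(1, 2, 2, 3, 3, 1,
                        4, 5, 5, 6, 6, 4,
                        3, 4), directed = FALSE)

comm <- cluster_walktrap(two_tri)

membership(comm)   # community id of each vertex
sizes(comm)        # number of vertices per community
length(comm)       # number of communities
modularity(comm)   # modularity score of the partition
```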

Your turn

From your graph, identify modules. How many modules are there? How many characters does the largest module contain?

Display the distance dendrogram. Display the modules in the graph.